@dishankj-max

  1. Previously, when retrieving Nebius instances, the CloudCredRefID field was always set to the current client's credential ID (c.refID), not the ID of the credential that originally created the instance.
  2. When creating an instance, we now store the creator's cloud credential ID in a new cloud-cred-ref-id label in the instance metadata.
  3. When converting Nebius instances back to v1 format, we read cloud-cred-ref-id from the instance labels instead of using the current client's ID.
  4. Backward compatibility: for instances created before this change (without the cloud-cred-ref-id label), the code falls back to the current client's ID and logs a warning.
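A minimal sketch of points 2 to 4, not the code from this PR. The label key comes from the description above; the helper names labelsForCreate and credRefIDFromLabels, the plain map used for labels, and the example credential IDs are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"log"
)

// Label key as named in the PR description.
const cloudCredRefIDLabel = "cloud-cred-ref-id"

// labelsForCreate (hypothetical helper) copies the caller's labels and
// records the creator's credential ID under the cloud-cred-ref-id key.
func labelsForCreate(base map[string]string, creatorRefID string) map[string]string {
	out := make(map[string]string, len(base)+1)
	for k, v := range base {
		out[k] = v
	}
	out[cloudCredRefIDLabel] = creatorRefID
	return out
}

// credRefIDFromLabels (hypothetical helper) returns the creator's credential
// ID stored on the instance. Instances created before the change carry no
// such label, so it falls back to the current client's ID and logs a warning.
func credRefIDFromLabels(labels map[string]string, currentRefID string) string {
	if id, ok := labels[cloudCredRefIDLabel]; ok && id != "" {
		return id
	}
	log.Printf("warning: no %s label found; falling back to current client refID %q",
		cloudCredRefIDLabel, currentRefID)
	return currentRefID
}

func main() {
	// New instance: the creator's ID is written at creation and read back later.
	labels := labelsForCreate(map[string]string{"env": "dev"}, "cred-creator")
	fmt.Println(credRefIDFromLabels(labels, "cred-current")) // prints "cred-creator"

	// Legacy instance without the label: falls back and warns on stderr.
	fmt.Println(credRefIDFromLabels(map[string]string{}, "cred-current")) // prints "cred-current"
}
```

The fallback keeps listing from failing on older instances; whether an empty label value should also trigger the fallback, as it does here, is a choice of this sketch.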

@dishankj-max dishankj-max requested a review from a team as a code owner January 8, 2026 10:58
@patelspratik
Contributor

patelspratik commented Jan 30, 2026

I've approved, but before I merge, can you walk me through an example of what used to happen and what happens now with the fix, and how that prevents these instances from being deleted by the DeleteOrphanWorkflow? I'm having trouble understanding the moving pieces.

@harshsharmanv

@patelspratik

We identified two root causes for the bug where Nebius instances were being deleted from dev environments:

  1. An incorrect tag for the stage (which we’ve addressed in this PR).
  2. The CloudCredRefID for Nebius instances wasn’t being persisted.

When we looked into how Nebius handles tags, we found that during instance creation, the CloudCredRefID label isn’t stored as a tag (reference). Additionally, when listing instances, Nebius always sets CloudCredRefID to the current client’s refID, not the creator’s (reference):

inst := &v1.Instance{
    RefID:          refID,
    CloudCredRefID: c.refID,  // ← Always uses the CURRENT client's refID, not the creator's!
}

In contrast, other providers like Shadeform store the cloudCredRefID as a tag when creating an instance and then read it from the tags when listing instances (for-ref, for-ref):

cloudCredRefID, found := tags[cloudCredRefIDTagName]
if !found {
    return nil, errors.WrapAndTrace(errors.New("could not find cloudCredRefID tag"))
}

Because Nebius set CloudCredRefID to the current client's refID, every listed instance appeared to belong to whichever credential was doing the listing, which allowed these instances to bypass the CloudCredRefID check. This PR resolves the bug by storing the creator's CloudCredRefID when the instance is created and reading it back when instances are listed.
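To make the before and after concrete, here is a small sketch of what two different clients would report for the same instance. It does not model the DeleteOrphanWorkflow itself; the instance type, the oldCredRefID and newCredRefID functions, and the cred-A and cred-B IDs are hypothetical and only mirror the behavior described above.

```go
package main

import "fmt"

// Label key as named in the PR description.
const cloudCredRefIDLabel = "cloud-cred-ref-id"

// instance is a hypothetical stand-in for an instance returned by the provider.
type instance struct {
	Name   string
	Labels map[string]string
}

// oldCredRefID mirrors the previous behavior: the instance is ignored and
// the listing client's own refID is always reported.
func oldCredRefID(_ instance, clientRefID string) string {
	return clientRefID
}

// newCredRefID mirrors the fixed behavior: the creator's ID is read from the
// label, with the client's refID used only when the label is absent.
func newCredRefID(inst instance, clientRefID string) string {
	if id, ok := inst.Labels[cloudCredRefIDLabel]; ok {
		return id
	}
	return clientRefID
}

func main() {
	// An instance created with credential cred-A.
	inst := instance{
		Name:   "gpu-1",
		Labels: map[string]string{cloudCredRefIDLabel: "cred-A"},
	}

	// The same instance listed by two clients with different credentials.
	for _, client := range []string{"cred-A", "cred-B"} {
		fmt.Printf("client=%s old=%s new=%s\n",
			client, oldCredRefID(inst, client), newCredRefID(inst, client))
	}
	// Output:
	// client=cred-A old=cred-A new=cred-A
	// client=cred-B old=cred-B new=cred-A
}
```

With the old behavior, a client using cred-B sees the instance as its own; with the fix, both clients see cred-A, so a check that compares CloudCredRefID values has the creator's ID to work with.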
